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ABSTRACT 

We propose a data assimilation scheme that produces the analyses for a global and an embedded limited 
area model simultaneously, considering forecast information from both models. The purpose of the 
proposed approach is twofold. First, we expect that the global analysis will benefit from incorporation of 
information from the higher resolution limited area model. Second, our method is expected to produce 
a limited area analysis that is more strongly constrained by the large scale flow than a conventional 
limited area analysis. The proposed scheme minimizes a cost function in which the control variable 
is the joint state of the global and the limited area models. In addition, the cost function includes a 
constraint term that penalizes large differences between the global and the limited area state estimates. 
The proposed approach is tested by idealized experiments, using 'toy' models introduced by Lorenz in 
2005. The results of these experiments suggest that the proposed approach improves the global analysis 
within and near the limited area domain and the regional analysis near the lateral boundaries. These 
analysis improvements lead to forecast improvements in both the global and the limited area models. 



1 Introduction 

Assuming that we have a global model and a regional 
model of higher accuracy defined in a subregion inside the 
global region, we aim to produce a forecast which is better 
than the one from each model by using information from 
both models. We test two data assimilation methods. The 
first method is based on techniques most commonly used 
in current practice and has recently been tested in Merkova 
et al. (2011). In this method, the global and the regional 
data assimilations are done separately, and the regional 
model receives information from the global model through 
the boundaries during the integration phase, but the global 
model does not receive information from the regional model. 
The second method, which we call the joint state method, 
is proposed in this paper. In this method, the global and re- 
gional data assimilations are coupled simultaneously using 
information contained in both the global and the regional 
forecast states, and the regional model receives information 
from the global model through the boundaries during the in- 
tegration phase as in the separate analysis method. We use 
the Local Ensemble Transform Kalman Filter (LETKF) al- 
gorithm for data assimilation. This algorithm allows efficient 
implementation of the localization technique proposed by 
Ott et al. (2004). In order to test our global/regional assim- 
ilation techniques we use numerical experiments based on 
simple atmospheric 'toy' models proposed in Lorenz (2005) 
in conjunction with simulated observations. We compare re- 
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suits of our joint state method and results of the separate 
analysis method. We find that better forecasts are produced 
by using the joint state method than by using the separate 
analysis method. We note that our proposed scheme would 
most likely be of potential interest for centers, where both 
global and limited area forecasts and analyses are prepared. 

The organization of the paper is the following. Section 2 
introduces the atmospheric toy models that we use. Sec- 
tion 3 describes the data assimilation schemes by the joint 
state method and by the separate analysis method. Section 4 
describes how the regional model is coupled to the global 
model at the boundaries of the subregion during the integra- 
tion phases of forecast cycles. Section 5 compares the results 
of our joint state method to those of the separate analysis 
method. Section 6 gives further discussion and summarizes 
our conclusions. 



2 True model, global model, and regional model 

Lorenz (2005) introduced three simple, spatially dis- 
crete, 1-dimensional models that have been proven to be 
useful for testing weather data assimilation methods. Here 
we will use Lorenz's model 2 (which shows smooth propa- 
gating waves) and the more refined Lorenz model 3 (which 
shows small scale activities on top of smooth waves). Lorenz 
model 3 is the following equation for the evolution of a scalar 
state variable Z n at spatial location n, 

dZ n /dt = [X, X] K ,„ + b 2 [Y, Y ] i ,„ + c[Y, X] i, n - X n - bY n + F, 

(1) 

where n is an integer, n = 0, 1, . . . , N — 1, and b, c, and F 
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are parameters, and a periodic boundary condition is used 
(Zn = Zo). The convention of index counting starting from 
is used throughout this paper. iV-component vectors X 
and Y are defined as 

i 

X n = '(a-p\i\)Zn+i, (2) 

i=-I 

Yn Z n — Xn, (3) 

« = (3/ 2 +3)/(2/ 3 +4/), (4) 

/?=(2/ 2 + l)/(/ 4 + 2I 2 ), (5) 

where the prime notation on E' signifies that the first and 
the last terms in the summation are divided by two, and I 
is a parameter. The bracket of any two vectors X and Y is 
defined as 

j J 

[X,Y]k,ti = ' 5^ ' (—Xn-2K-iY„-K-j 

j — — J i= — J 

+ X n -K+j-iYn + K+j)/K (6) 

when K is even, and E' is replace by E when K is odd; 
J = K/2 when K is even, and J = (K — l)/2 when K is 
odd, where K is a parameter. Model 3 reduces to model 2 
when J = 1. In particular, for 1=1, Eq. (2) yields X n = Z„, 
which by Eq. (3) implies that Y n = 0. Thus, after changing 
notation, n — > m and Z n — ¥ Z m , we obtain 

dZ m /dt = [Z, Z] K ,m -Z m + F, (7) 

where m is used to denote a point on the coarser grid of the 
global model. 

We use Lorenz model 3 with parameter values N = 
960, K = 32, fe = 10, c = 0.6, F = 15,/ = 12 to gener- 
ate our simulated true dynamics, and Lorenz model 2 with 
N — 240, K — 8, F = 15 for the global model defined at ev- 
ery fourth grid point of the true model (n = 0, 4, 8, . . . , 956). 
Thus the grid points for Eq. (7) occur at m = n/4, where 
n — 0, 4, 8, ... , 956. We assume that, between analyses, 
Eq. (7) for Z m gives an approximation of the dynamical evo- 
lution of Z n (t) at the grid points n = 4m. When referring to 
locations or lengths of regions, we use the coordinate system 
of the true model throughout this paper (n — 0,1, ... , 959). 
For the regional model, we define a subregion extending from 
n = no = 240 to n = ni = 720 grid, and use Lorenz model 
3 with the same parameter values as the true model. In or- 
der to integrate this regional model, we must evaluate the 
bracket quantities on the right hand side of Eq. (1) defined 
by Eq. (6). For n too close to no (m) this involves X, Y, 
and Z values at grid points outside the subregion, n < no 
(n > ni). Also, from Eq. (2), X n in the regional model (and 
hence also Y n ) depends on Z n i values in n! < no (n' > m) if 
n is within a distance / of no (ni). To evaluate these quanti- 
ties, we use estimates of the required values of Z n i obtained 
from interpolation of the global values Z m onto the n-grid. 
These interpolations essentially play the role of boundary 
conditions for the regional model. 



3 Data assimilation 

We selected 15 evenly spaced observation points start- 
ing from n = (n = 0, 64, 128, . . . , 896). Notice that all the 
observation points are at grid points defined in the global 



model. We construct simulated observations by adding ran- 
dom noise drawn from independent Gaussian distributions 
of standard deviation 1 to the true state values at the ob- 
servation points. 

We compare two data assimilation methods. The first 
method does data assimilation for the global model and the 
regional model separately, while the second method, which 
we call the joint state method, forms a combined state from 
the global model and the regional model and does data as- 
similation on the combined state. The intuition motivating 
our second method is that we expect the global and the re- 
gional estimates will both benefit from information exchange 
between them. We use LETKF for both methods. See Hunt 
ct al. (2007) for an explanation of LETKF. 

For the separate analysis method, we use LETKF with- 
out much modification. For the global analysis, at each grid 
point n = 4m denned in the global model, we define a local 
patch [n — s, n + s] of size 2s + 1 with s — 40, use the Ensem- 
ble Transform Kalman Filter (ETKF) to obtain an analysis 
for the (2s + 1) state values in each patch. This yields local 
patch analyses for each ensemble member. As done by others 
(e.g., Hunt et al., 2007), we then use these patch analyses 
to form the global analysis states for each ensemble member 
by defining the value of the global ensemble field at each 
point m = n/4 to be the analysis state value of that ensem- 
ble member in the center of patch n = 4m. For the regional 
analysis, at each grid point n defined in the regional model, 
we define a local patch, limiting the size near the two bound- 
aries of the subregion so that the local patch is defined only 
inside the subregion, use ETKF, and take the patch analysis 
value at grid point n. Thus the global local patches always 
have size 2s + 1, but the regional local patches have variable 
sizes depending on n. For n located in the subregion and 
also far away from the boundaries, the regional local patch 
has size 2s + 1, while for n near the boundaries (n + s > m 
or n — s < no), the regional local patch is the intersection, 
[n — s, n + s] n [n , m], and has a size less than 2s + 1. 

For the joint state method, we use the same local patch 
size, s = 40. For each grid point n defined either in the 
global model or in the regional model, we define a global 
local patch and a regional local patch (where, as before, the 
regional patch is the intersection, [n — s,n + s] n [no,ni], 
which for some n = 4m will be empty). For each such grid 
point n, we define a vector Xg by taking state values of 
the global local patch, and xj.™' by taking state values of the 
regional local patch, and we then form a local joint state 
vector x^™' by concatenating Xg and xf n ', i.e., 

( ("A 

x<»> = (J,,] • (8) 

We also form a local observation vector yi™- 1 by taking obser- 
vations in the local patches (from grid point n — s to n + s). 
We define a local cost function J^ n '(pc- n ') for grid point n 
as follows, 

j(")( x (")) = ( x («) -x£ n) ) T (p£ n) )- 1 (x (n) -x' n) ) (9) 

+ [y<, n) -H (n) (x w )] T R- 1 [yi n) -H (n) (x (n) )] 

+ k [g<"»(x("») - G(" ) (x(" ) )] T [g<"»(x("») - G^x^)] , 

where x^ and P^" are the local mean and the covariance 
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matrix of the background ensemble, respectively, H' n ' (x*-™') 
is a local observation operator denned as 



H 



(1 — A) x gjJ (s) + A x r>J (j) , 



if n < < m; 
otherwise, 

(10) 

where is the observation location of the i th observation 
in the local patch, x g ju-) is the global state value at location 

and av.j'fi) ^ s the regional state value at location 
Gg (xg ) is a vector that consists of the state values of 
the global state at the grid points defined both in the global 
and the regional local patches. Similarly, G^™' (xj. ) is a vec- 
tor that consists of the state values of the regional state at 
the grid points defined both in the global and the regional 
local patches, k and A are parameters. The third term is 
a constraint term that penalizes large differences between 
the estimates of the global and regional model states. We 
determine the value of x that minimizes the cost function 
J (n) (x (n) ) with the LETKF algorithm (Hunt et al., 2007). 

In general, if our technique were to be applied in an 
operational setting, the grid points of the global and the re- 
gional models within the subregion will not coincide. In that 
case, to calculate the third term in J>"'(x'"'), an interpola- 
tion from the grid points of the regional model to the grid 
points of the global model or vice versa could be employed 
before the values of the regional and the global models are 
subtracted. Similarly, in an operational setting the observa- 
tions are not at grid points, and H^™- 1 would then include 
interpolation. 



4 Model integration 

We define a smoothed regional state for the initial con- 
dition of the regional model for integration between analysis 
times as follows. After the analysis phase, we define spatial 
transition intervals of length 10 starting from the bound- 
aries and ending inside the subregion. We then modify the 
regional analysis values in the transition intervals by taking 
weighted linear averages of the global analysis values and 
the regional analysis values. We do this in order to make the 
transition between the global model and the regional model 
smooth at the boundaries. For n such that ^ n < 10, we 
modify the regional ensemble members by 

X r Kno+n <- (n/10) + (1 - n/W) (11) 

X r k , ni . n <- (n/10)X£, ni _ n + {l-n/W)Xl ni _ n , (12) 

where n and X£ n are the values of the k th global and 
regional ensemble members at grid point n, respectively, and 
the subregion for the regional model is [no, ni] = [240, 720]. 

After performing the above smoothing process, we in- 
tegrate each global and regional ensemble members for 6 
hours using a fourth-order Runge-Kutta method, dividing 
6 hours into 24 time steps. We integrate the global ensem- 
ble members independent of the regional ensemble mem- 
bers. For the integration of the regional ensemble members, 
we use the necessary interpolated values of the correspond- 
ing global ensemble members outside the subregion at each 
Runge-Kutta time step to synchronize the global and the 
regional model at the boundaries. 
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Figure 1. Rms errors of the state estimates of the separate anal- 
ysis and the joint state analysis using the whole region for both 
the global and the regional models. The rms-error values were 
averaged over 10000 forecast cycles, discarding the values of 1000 
initial cycles. The green and the black colors correspond to the 
global and the regional values obtained using the separate analy- 
sis method. The blue and the red colors correspond to the global 
and the regional values obtained using the joint state method. 



5 Results 

Before we tested the joint state method and the sep- 
arate analysis method, we ran forecast cycles with 40 en- 
semble members using the global and the regional models 
separately and found that multiplicative covariance infla- 
tion factors of 0.024 and 0.02 for the global and the regional 
analyses, respectively, produce the lowest rms state estimate 
errors. We henceforth use these values in our data assimila- 
tions. For the joint state method, we found that A = 0.9 and 
k = 0.04 in Eqs. (9) and (10) give the lowest rms state esti- 
mate errors, and we use these values in all of our subsequent 
applications of the joint state method. 

We first tested the separate analysis method and the 
joint state method without boundaries. That is, we used 
the whole region for both the global and regional models. 
Thus, there is no coupling between the global model and 
the regional model at the boundaries during the integration 
phases. In this setup, aside from the correlations induced 
by common observations in their assimilations, the separate 
analysis method corresponds to having independent global 
and regional forecasts. For the joint state method, the cou- 
pling between the global and regional models occurs only at 
the analysis phases. Figure 1 shows the rms errors of state 
estimates given by the means of the ensemble members as 
a function of the grid point. The values were averaged over 
10000 forecast cycles, discarding the values of 1000 initial 
cycles. The green and the black colors correspond to the 
global and the regional values obtained from the separate 
analysis method. The blue and the red colors correspond to 
the global and the regional values obtained from the joint 
state method. Error minima occur at the observation points. 
The figure shows that the two regional rms errors are almost 
the same, while the global rms errors from the joint state 
method are much lower than the global rms errors of the sep- 
arate analysis case indicating that, as one would expect, the 
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Figure 2. Rms errors of (a) the state estimates (b) 1 day forecasts 
of the separate analysis and the joint state analysis. The color 
scheme is the same as in Fig. 1. The additional purple curves 
show the rms-error values when assimilations were done globally 
using the true model (Lorenz model 3). The two vertical dashed 
lines at grid points 240 and 720 indicate the boundaries of the 
subregion. 



information from the regional model substantially improved 
the estimate of the global model. 

Now, we take a subregion [nn,ni] = [240,720], and in- 
troduce coupling between the global model and the regional 
model at the two boundaries during the integration phase. 
Figures 2(a) and 2(b) show the rms errors of the analysis and 
of a 1 day forecast, respectively, using the same color scheme 
as in Fig. 1. The two vertical dashed lines at grid points 240 
and 720 indicate the boundaries of the subregion. The ad- 
ditional purple curves show the rms-error values in the per- 
fect model scenario in which the forecast model was the true 
model (Lorenz model 3) which was used globally throughout 
the entire space. We view this as setting a standard for the 
best that could ever be done. These figures show that the 
joint state method performs better than the separate anal- 
ysis method for both the global prediction and the regional 
prediction. We note that the global forecast obtained from 
the joint state method is better than the corresponding one 
from the separate analysis method even outside the subre- 
gion. This can be explained by the fact that the better global 
state estimates inside the subregion at the analysis phases 
can make better forecasts outside the subregion during the 



integration phases, and these better forecasts outside the 
subregion can make the regional forecasts better inside the 
subregion by providing better information at the boundaries 
during the integration phases. We also note that the global 
analysis improvements that result from use of the joint state 
method are greater to the right of the subregion than to its 
left. This is consistent with the fact (Lorenz, 2005; Yoon 
et al., 2010) that, for these models, waves (and hence the 
information they carry) have group velocities that are pre- 
dominantly rightward. 



6 Discussion and conclusion 

In this paper we formulated a joint state method for re- 
gional forecasting. Using simulations employing simple mod- 
els, we have numerically tested our method by comparing 
analysis and forecast results obtained using our method with 
results obtained using a separate analysis method. We found 
that the global forecast in the whole region and the regional 
forecast in the subregion are both noticeably improved when 
the joint state method is used compared to when the sepa- 
rate analysis method is used. 

This work suggests several topics for future work. Most 
importantly, will the encouraging results from experiments 
using our Lorenz model set-up continue to apply when tests 
on real situations are done? What is the effect of regional 
model error? What are the benefits of applying our coupled 
analysis scheme to situations with multiple (perhaps over- 
lapping) regional analyses? 
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